Improved canine exome designs, featuring ncRNAs and increased coverage of protein coding genes
Abstract
By limiting sequencing to those sequences transcribed as mRNA, whole exome sequencing is a cost-efficient technique often used in disease-association studies. We developed two target enrichment designs based on the recently released annotation of the canine genome: the exome-plus design and the exome-CDS design. The exome-plus design combines the exons of the CanFam 3.1 Ensembl annotation, more recently discovered protein-coding exons and a variety of non-coding RNA regions (microRNAs, long non-coding RNAs and antisense transcripts), leading to a total size of ≈ 152 Mb. The exome-CDS was designed as a subset of the exome-plus by omitting all 3' and 5' untranslated regions. This reduced the size of the exome-CDS to ≈ 71 Mb. To test the capturing performance, four exome-plus captures were sequenced on a NextSeq 500 with each capture containing four pre-capture pooled, barcoded samples. At an average sequencing depth of 68.3x, 80% of the regions and well over 90% of the targeted base pairs were completely covered at least 5 times with high reproducibility. Based on the performance of the exome-plus, we estimated the performance of the exome-CDS. Overall, these designs provide flexible solutions for a variety of research questions and are likely to be reliable tools in disease studies.
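The two coverage figures quoted above measure different things: the share of target regions in which every base reaches the depth threshold, and the share of all targeted base pairs that reach it. The following is a minimal sketch of that distinction, assuming per-base depths for each target region are already available. The function name, the region names and the toy depths are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def coverage_summary(
    depths: Dict[str, List[int]], min_depth: int = 5
) -> Tuple[float, float]:
    """Return (fraction of regions fully covered, fraction of bases covered).

    `depths` maps a target-region name to its per-base sequencing depths.
    A region counts as fully covered only if every base has depth >= min_depth.
    """
    total_regions = len(depths)
    total_bases = sum(len(d) for d in depths.values())
    if total_regions == 0 or total_bases == 0:
        raise ValueError("no target bases supplied")

    # Region-level metric: all bases in the region must pass the threshold.
    full_regions = sum(
        1 for d in depths.values() if d and all(x >= min_depth for x in d)
    )
    # Base-level metric: each base is counted on its own.
    covered_bases = sum(
        1 for d in depths.values() for x in d if x >= min_depth
    )
    return full_regions / total_regions, covered_bases / total_bases


# Hypothetical toy data: three short regions with made-up depths.
toy = {
    "region_a": [12, 9, 7, 6],
    "region_b": [5, 5, 4, 8],  # one base below 5x, so the region fails
    "region_c": [30, 28],
}
region_frac, base_frac = coverage_summary(toy)
print(f"regions fully covered: {region_frac:.2f}, bases covered: {base_frac:.2f}")
```

A single under-covered base is enough to fail a whole region, which is why the region-level percentage is expected to be lower than the base-level percentage for the same data.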
Similar resources
Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein‐Coding Regions
For next-generation sequencing technologies, sufficient base-pair coverage is the foremost requirement for the reliable detection of genomic variants. We investigated whether whole-genome sequencing (WGS) platforms offer improved coverage of coding regions compared with whole-exome sequencing (WES) platforms, and compared single-base coverage for a large set of exome and genome samples. We find...
Editorial: The Post-Exome Era
The Iranian Rehabilitation Journal (IRJ) invites research papers on the genetic basis of single gene and complex disorders. This vastly dynamic branch of science will complement the multidisciplinary wealth of expertise in the fields of social welfare and rehabilitation. The past few years have witnessed outstanding research projects on the genetic causes of numerous debilitating disorders, suc...
Development and performance of a targeted whole exome sequencing enrichment kit for the dog (Canis Familiaris Build 3.1)
Whole exome sequencing is a technique that aims to selectively sequence all exons of protein-coding genes. A canine whole exome sequencing enrichment kit was designed based on the latest canine reference genome (build 3.1.72). Its performance was tested by sequencing 2 exome captures, each consisting of 4 pre-capture pooled, barcoded Illumina libraries on an Illumina HiSeq 2500. At an average s...
Non-coding RNAs in schistosomes: an unexplored world.
Non-coding RNAs (ncRNAs) were recently given much higher attention due to technical advances in sequencing which expanded the characterization of transcriptomes in different organisms. ncRNAs have different lengths (22 nt to >1,000 nt) and mechanisms of action that essentially comprise a sophisticated gene expression regulation network. Recent publication of schistosome genomes and transcriptom...
Identification and Comparative Analysis of ncRNAs in Human, Mouse and Zebrafish Indicate a Conserved Role in Regulation of Genes Expressed in Brain
ncRNAs (non-coding RNAs), in particular long ncRNAs, represent a significant proportion of the vertebrate transcriptome and probably regulate many biological processes. We used publicly available ESTs (Expressed Sequence Tags) from human, mouse and zebrafish and a previously published analysis pipeline to annotate and analyze the vertebrate non-protein-coding transcriptome. Comparative analys...